library(ggdag)

dag <-
  dagify(
    W ~ N + p,
    L ~ N + p,
    coords = time_ordered_coords()
  )

ggdag(dag) + theme_dag()
The globe-tossing example from chapter two presents an interesting case of causal influence.

Here, N represents the number of times the globe is tossed, W the number of tosses caught with your finger on water, and L the number caught on land. p represents the proportion of the earth's surface covered by water.
The DAG says that "the proportion of the earth covered by water and the number of tosses influence both the number of water catches and the number of land catches," an obvious and profound result.
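The DAG corresponds directly to a generative model: fix p and N, and the counts W and L fall out of a binomial draw. A minimal simulation sketch (p = 0.71 is the familiar "true" proportion of water; the function name and seed are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(42)

def toss_globe(N, p=0.71):
    """Simulate N globe tosses; return (water, land) counts."""
    W = rng.binomial(N, p)  # each toss lands on water with probability p
    return W, N - W

W, L = toss_globe(9)
print(W, L)  # W + L always equals N = 9
```

Because W = N - L by construction, N and p jointly determine both counts, which is exactly what the arrows in the DAG claim.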
For each possible explanation of the sample,
Count all the ways the sample could happen.
Explanations with more ways to produce the sample are more plausible.
Statistical Rethinking 2023 - 02 - The Garden of Forking Data (9:22)
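The counting recipe can be made concrete. Pretend the globe has only four sides, so the candidate proportions of water are 0/4 through 4/4, and count the ways each candidate could produce an observed sequence (a sketch of the lecture's marble-counting logic, not code from the lecture itself):

```python
sample = [1, 0, 1]  # observed tosses: Water, Land, Water

ways_per_candidate = []
for water_sides in range(5):
    land_sides = 4 - water_sides
    ways = 1
    for toss in sample:
        # each toss forks into one path per side that matches it
        ways *= water_sides if toss == 1 else land_sides
    ways_per_candidate.append(ways)
    print(f"p = {water_sides}/4: {ways} ways")
# p = 3/4 produces the sample in 9 ways, more than any
# other candidate, so it is the most plausible explanation
```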
Bayesian models learn, adjusting their estimates as they read new data:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as st

x = np.linspace(0, 1)
fig, axes = plt.subplots(3, 3, figsize=(10, 10))
axes = axes.flatten()

# 1 = Water, 0 = Land
tosses = [1, 0, 0, 1, 0, 1, 1, 1, 1]

# Calculate the maximum y-value across all distributions
max_y = 0
for i in range(1, len(tosses) + 1):
    W = sum(tosses[0:i])
    L = i - W
    y_vals = st.beta.pdf(x, W + 1, L + 1)
    max_y = max(max_y, np.max(y_vals))
# Add a small margin to the max_y for plotting
max_y *= 1.1

label = ""
# each iteration represents a learning iteration
for i in range(1, len(tosses) + 1):
    if tosses[i - 1] == 0:
        label += "L"
    else:
        label += "W"
    W = sum(tosses[0:i])
    L = i - W
    axes[i - 1].axvline(x=0.71, color="red", linestyle="dashed")
    axes[i - 1].plot(x, st.beta.pdf(x, W + 1, L + 1), "b-")
    axes[i - 1].set_title(label)
    axes[i - 1].set_xlabel("p")
    axes[i - 1].set_ylabel("Density")
    axes[i - 1].set_ylim(0, max_y)
plt.tight_layout()
plt.show()
Instead of watching the model "learn" the posterior one observation at a time, you can sample from the posterior directly:
from IPython.display import HTML
import matplotlib.animation as animation

fig, ax = plt.subplots(1, 3, figsize=(10.5, 3.5))
post_preds = []

def hmcmc(frame, W=6, L=3):
    # Clear axes before replotting
    for ax_ in ax:
        ax_.clear()
    # Sample from the posterior
    sample = st.beta.rvs(W + 1, L + 1)
    post_preds.append(sample * (W + L))
    # Plot 1: the Beta PDF + current sample
    ax[0].plot(x, st.beta.pdf(x, W + 1, L + 1), "b-")
    ax[0].axvline(x=sample, color="red", linestyle="dashed", alpha=0.6)
    ax[0].set_title(f"Analytical PDF (Sample: {sample:.2f})")
    # Plot 2: predictive distribution (binomial) at this sampled p
    pred_samples = st.binom.rvs(n=W + L, p=sample, size=1000)
    ax[1].hist(
        pred_samples,
        bins=np.arange(W + L + 2) - 0.5,
        color="skyblue",
        edgecolor="black",
    )
    ax[1].set_title("Predictive Distribution")
    ax[1].set_ylim(0, 350)
    # Plot 3: posterior predictive, accumulated across frames
    ax[2].hist(
        post_preds,
        bins=np.arange(W + L + 2) - 0.5,
        color="orange",
        edgecolor="black",
    )
    ax[2].set_title(f"Posterior Predictive (N={len(post_preds)})")
    ax[2].set_ylim(0, 45)

ani = animation.FuncAnimation(fig=fig, func=hmcmc, frames=100, interval=50)
plt.close()
HTML(ani.to_jshtml())
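Once you have draws from the posterior, summaries of the posterior are just summaries of the samples. A short sketch using the same W = 6, L = 3 data (the 89% interval bound is the conventional Rethinking choice; the seed is arbitrary):

```python
import numpy as np
import scipy.stats as st

W, L = 6, 3
# With a flat prior, the posterior for p is Beta(W + 1, L + 1)
samples = st.beta.rvs(W + 1, L + 1, size=10_000, random_state=1)

mean = samples.mean()                         # posterior mean of p
lo, hi = np.percentile(samples, [5.5, 94.5])  # 89% percentile interval
print(f"mean = {mean:.2f}, 89% PI = ({lo:.2f}, {hi:.2f})")
```

The sample mean should sit near the analytical posterior mean, (W + 1) / (W + L + 2) = 7/11 ≈ 0.64.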